Instruction Scheduling for Instruction Level Parallel Processors

Author

  • PAOLO FARABOSCHI
Abstract

Nearly all personal computer and workstation processors, and virtually all high-performance embedded processor cores, now embody instruction level parallel (ILP) processing in the form of superscalar or very long instruction word (VLIW) architectures. ILP processors put much more of a burden on compilers; without “heroic” compiling techniques, most such processors fall far short of their performance goals. Those techniques are largely found in the high-level optimization phase and in the code generation phase; they are also collectively called instruction scheduling. This paper reviews the state of the art in code generation for ILP processors. Modern ILP code generation methods move code across basic block boundaries. These methods grew out of techniques for generating horizontal microcode, so we introduce the problem by describing its history. Most modern approaches can be categorized by the shape of the scheduling “region.” Some of these regions are loops, and for those, techniques known broadly as “Software Pipelining” are used. Software Pipelining techniques are considered here only where they raise issues relevant to the region-based techniques presented. The selection of a type of region to use in this process is one of the most controversial questions in code generation; the paper surveys the best known alternatives. The paper then considers two questions. First, given a type of region, how does one pick specific regions of that type in the intermediate code? In conjunction with region selection, we consider region enlargement techniques such as unrolling and branch target expansion. The second question, how does one construct a schedule once regions have been selected, occupies the next section of the paper. Finally, schedule construction is reexamined using recent, innovative resource modeling based on finite-state automata. The paper includes an extensive bibliography.

Similar resources

Treegion Scheduling for Highly Parallel Processors

Instruction scheduling is a compile-time technique for extracting parallelism from programs for statically scheduled instruction-level parallel processors. Typically, an instruction scheduler partitions a program into regions and then schedules each region. One style of region represents a program as a set of decision trees or treegions. The non-linear nature of the treegion allows scheduling a...

A Theory for Software-Hardware Co-Scheduling for ASIPs and Embedded Processors

Exploiting instruction-level parallelism (ILP) is extremely important for achieving high performance in application specific instruction set processors (ASIPs) and embedded processors. Existing techniques deal with either scheduling hardware pipelines to obtain higher throughput or software pipelining loops (software pipelining being an instruction scheduling technique for iterative computation) for exploiting greater I...

Registers On Demand, an integrated region scheduler and register allocator

Two of the most important phases of code generation for instruction level parallel processors are register allocation and instruction scheduling. Applying these two phases separately has major drawbacks, such as the introduction of false dependences or higher register pressure and thus more spill code. In this paper we present a new method which integrates register allocation and region schedulin...

Jetpipeline: a Hybrid Pipeline Architecture for Instruction-level Parallelism

High performance processors based on pipeline processing play an important role in scientific and engineering computation. However, it is difficult to reach a satisfactory solution when both a high degree of flexibility in parallel processing and low hardware complexity must be taken into account. This paper proposes a hybrid pipeline architecture named Jetpipeline that possesses a high degree of flexibilit...

Dynamic Scheduling Techniques for VLIW Processors

Keywords: instruction-level parallelism, VLIW processors, superscalar processors, pipelining, multiple operation issue, scoreboarding, dynamic scheduling, out-of-order execution

VLIW processors are viewed as an attractive way of achieving instruction-level parallelism because of their ability to issue multiple operations per cycle with relatively simple control logic. They are also perceived as being of...

Aligned Scheduling: Cache-Efficient Instruction Scheduling for VLIW Processors

The performance of statically scheduled VLIW processors is highly sensitive to the instruction scheduling performed by the compiler. In this work we identify a major deficiency in existing instruction scheduling for VLIW processors. Unlike most dynamically scheduled processors, a VLIW processor with no load-use hardware interlocks will completely stall upon a cache-miss of any of the operations...

Publication date: 2001